Unique Reconstruction of Coded Strings From Multiset Substring Spectra
Related Resources
Shortest Unique Substring Queries on Run-Length Encoded Strings
We consider the problem of answering shortest unique substring (SUS) queries on run-length encoded strings. For a string S, a unique substring u = S[i..j] is said to be a shortest unique substring (SUS) of S containing an interval [s, t] (i ≤ s ≤ t ≤ j) if for any i′ ≤ s ≤ t ≤ j′ with j − i > j′ − i′, S[i′..j′] occurs at least twice in S. Given a run-length encoding of size m of a string of len...
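The SUS definition above can be illustrated with a brute-force sketch on an uncompressed string. This is not the run-length-encoded algorithm of the paper; it only spells out the definition. The function names and the 0-based inclusive indexing are assumptions made for this illustration.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def shortest_unique_substrings(text, s, t):
    """Return all (i, j), 0-based inclusive, with i <= s <= t <= j such that
    text[i..j] occurs exactly once and no shorter substring covering [s, t]
    is unique. Brute force, for illustration only."""
    n = len(text)
    # Try candidate lengths from the shortest that can cover [s, t] upward.
    for length in range(t - s + 1, n + 1):
        found = []
        # i must satisfy i <= s, i + length - 1 >= t, and i + length <= n.
        for i in range(max(0, t - length + 1), min(s, n - length) + 1):
            j = i + length - 1
            if count_occurrences(text, text[i:j + 1]) == 1:
                found.append((i, j))
        if found:
            return found
    return []


print(shortest_unique_substrings("abab", 1, 1))
```

For "abab" and the interval [1, 1], the single character "b" and the substring "ab" both occur twice, so the only SUS is "ba" at positions 1..2.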
String Reconstruction from Substring Compositions
Motivated by mass-spectrometry protein sequencing, we consider the problem of reconstructing a string from the multisets of its substring composition. We show that all strings of length 7, one less than a prime and one less than twice a prime, can be reconstructed uniquely up to reversal. For all other lengths, we show that unique reconstruction is not always possible and provide sometimes-tigh...
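As a minimal sketch of the object being studied, the composition of a substring can be taken as the multiset of its symbols (order discarded), and the composition multiset of a string as the collection of compositions of all its contiguous substrings. The function name and the sorted-tuple representation below are assumptions for illustration, not notation from the paper.

```python
from collections import Counter


def composition_multiset(s):
    """Multiset of compositions of all contiguous substrings of s.
    Each composition is stored as a sorted tuple of symbols, so the
    order of symbols inside a substring is deliberately forgotten."""
    result = Counter()
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            result[tuple(sorted(s[i:j]))] += 1
    return result


# A string and its reversal always share the same composition multiset,
# which is why reconstruction can only be unique "up to reversal".
print(composition_multiset("0010") == composition_multiset("0100"))
```

A string of length n has n(n+1)/2 contiguous substrings, so the multiset above always holds that many compositions in total.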
Computing Longest Common Substring and All Palindromes from Compressed Strings
This paper studies two problems on compressed strings described in terms of straight line programs (SLPs). One is to compute the length of the longest common substring of two given SLP-compressed strings, and the other is to compute all palindromes of a given SLP-compressed string. In order to solve these problems efficiently (in polynomial time w.r.t. the compressed size) decompression is never...
Shortest Unique Substring Query Revisited
We revisit the problem of finding a shortest unique substring (SUS), proposed recently by [6]. We propose an optimal O(n) time and space algorithm that can find an SUS for every location of a string of size n. Our algorithm significantly improves the O(n²) time complexity needed by [6]. We also support finding all the SUSes covering every location, whereas the solution in [6] can find only one SUS ...
Approximate Substring Matching over Uncertain Strings
Text data is prevalent in life. Some of this data is uncertain and is best modeled by probability distributions. Examples include biological sequence data and automatic ECG annotations, among others. Approximate substring matching over uncertain texts is largely an unexplored problem in data management. In this paper, we study this intriguing question. We propose a semantics called (k, τ)-match...
Journal
Journal title: IEEE Transactions on Information Theory
Year: 2019
ISSN: 0018-9448,1557-9654
DOI: 10.1109/tit.2019.2935973